Description
Background 

[0001] Speech recognition systems, particularly computer-based speech recognition systems, are well known. Numerous inventions and voice transcription technologies have been developed to address various problems within speech recognition systems. In one aspect, advanced mathematics and processing algorithms have been developed to address the needs of translating vocal input into computer text through speech parsing, phoneme identification and database matching of the input speech so as to accurately transcribe the speech into text.

[0002] General speech recognition databases are also well known. U.S. Patent No. 6,631,348 (Wymore), for example, discloses a speech recognition system in which vocal training information is provided to create different vocal reference patterns under different ambient noise levels. The Wymore invention creates a database of captured speech from this training input. During operation, a user of the Wymore system may then dictate speech under various ambient noise conditions and the speech recognition system properly filters the noise from the user's input speech based on the different stored models to determine the appropriate spoken words, thereby improving the accuracy of the speech transcription.
[0003] U.S. Patent No. 6,662,160 (Chien et al.) also discloses a system involving adaptive speech recognition methods that include noise compensation. Like Wymore, the system of Chien et al. neutralizes noise associated with input speech through the use of preprocessed training input. Chien et al. employs complex statistical mathematical models (e.g. Hidden Markov Models) and applies optimal equalization factors in connection with feature vectors and probability density functions related to various speech models so as to accurately recognize a user's speech.

[0004] Other voice transcription systems address the problems of minimizing and correcting misrecognition errors. For example, U.S. Patent No. 6,195,637 (Ballard et al.) discloses a transcription system that accepts a user's dictation and contemporaneously allows a user to mark misrecognized words during the dictation. At the conclusion of dictation, a computer-based, textual correction tool is invoked with which the user may correct the marked, misrecognized words. Numerous, potentially intended words, e.g. words that are close in phonetic distance to the actual speech, are provided by the Ballard et al. system for possible replacement of the misrecognized word. Other examples of misrecognized words include incorrectly spelled words and improperly formatted words (e.g. lack of upper case letters in a name or incorrect punctuation). In one embodiment, Ballard et al. discloses a computer having a windows-based, graphical user interface that displays the list of potentially intended words from which the user selects the appropriate word with a graphical input device, such as a computer mouse.
[0005] Other existing speech recognition systems deal with problems associated with large speech recognition vocabularies, i.e. the entire English language. These systems typically address the allocation of the computer-based resources required to solve the speech recognition problems associated with such a vocabulary. U.S. Patent No. 6,430,557 (Jeppesen), for example, discloses a system and method for recognizing and transcribing continuous speech in real time. In one embodiment, the disclosed speech recognition system includes multiple, geographically distributed computer systems connected by high speed links. A portion of the disclosed computer system is responsible for preprocessing continuous speech input, such as filtering any background noise provided during the speech input, and subsequently converting the resultant speech signals into digital format. The digital signals are then transcribed into word lists upon which automatic speech recognition components operate. Jeppesen's speech recognition system is also trainable so as to accommodate more than one type of voice input, including vocal input containing different accents and dialects. Thus, this speech recognition system is capable of recognizing large vocabulary, continuous speech input in a consistent and reliable manner, particularly speech that involves variable input rates and different dialects and accents. Jeppesen further discloses systems having on-site data storage (at the site of the speech input) and off-site data storage which stores the databases of transcribed words. Thus, in one aspect, a primary advantage of Jeppesen is that a database of large scale vocabularies containing speech dictations is distributed across different geographical areas such that users employing dialects and accents within a particular country or portion of the world would be able to use localized databases to accurately transcribe their speech input.

[0006] Other large vocabulary speech recognition systems are directed to improving the recognition of dictated input through the use of specialized, hierarchically arranged vocabularies. The computerized speech recognition system of U.S. Patent No. 6,526,380 (Thelan et al.), for example, employs a plurality of speech recognition models that accept incoming speech in parallel and attempt to match the speech input within specific databases. Since the English language vocabulary, for example, is relatively large, the speech matching success rate using such a large vocabulary for any given particular dictation may be lower than what is acceptable for a particular application. Thelan et al. attempts to solve this problem through the use of specific vocabularies selected by the voice recognition modules after a particular speech vocabulary and associated text database is determined to be more appropriately suited to the dictation at issue. Thus, Thelan et al. begins with an ultra-large vocabulary and narrows the text selection vocabularies depending on the speech input so as to select further refined vocabularies that provide greater transcription accuracy. Model selectors are operative within Thelan et al. to enable the recognition of more specific models if the specific models obtain good recognition results. These specific models may then be used as replacement for the more generic vocabulary model. As with Jeppesen, Thelan et al. discloses a computer-based speech recognition system having potentially distributed vocabulary databases.

[0007] Heretofore, no computerized speech recognition systems have been developed that take advantage of repeated dictation of specific terms into specific form fields or repeated dictation of specific terms by specific persons. In particular, context-specific vocabularies or context-specific modifications of matching probabilities have not been provided with respect to a context-specific vocabulary which is used in conjunction with more general vocabularies. The modern necessity of using specific, computerized, form-based input creates a unique problem in that the general vocabularies used by many of the commercial speech recognition software programs do not provide efficient and accurate recognition and transcription of users' input speech. The limitations of the present systems lie in the fact that any vocabulary large enough to accommodate general as well as specific text will have phonetically similar general text so as to cause an unacceptably high error rate.
[0008] WO 01/26093 describes a speech recognition system for searching a first grammar file for a matching phrase and searching a second grammar file if a matching phrase is not found in the first grammar file.
[0009] WO 01/69905 describes a speech recognition 
system that stores caller specific voice files to help rec- 
ognise speech of frequent callers and a generic voice file 
used when a caller does not have a caller-specific de- 
scription. 

[0010] Aspects of the present invention are set out in the appended independent claims.
[0011] According to a preferred embodiment of the invention, a method is provided for improving the accuracy of a computerized, speech recognition system, the speech recognition system including a base vocabulary. The method includes loading a specified vocabulary into computer storage, the specified vocabulary associated with a specific context; accepting a user's voice input into the speech recognition system; evaluating the user's voice input with data values from the specified vocabulary according to an evaluation criterion; selecting a particular data value as an input into a computerized form field if the evaluation criterion is met; and if the user's voice input does not meet the evaluation criterion, selecting a data value from the base vocabulary as an input into the computerized form field. According to further embodiments, the method further includes evaluating the user's voice input with data values from the base vocabulary according to a base evaluation criterion if the user's voice input does not meet the evaluation criterion. According to another embodiment, the evaluation criterion is a use weighting associated with the data values. The evaluating may further include the step of applying a matching heuristic against a known threshold. The step of applying a matching heuristic may further include a step of comparing the user's voice input to a threshold probability of matching an acoustic model derived from the specified vocabulary. The context is associated with any one or more of the following: a topical subject, a specific user, and a field.
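By way of illustration only, the two-pass selection described in paragraph [0011] might be sketched as follows. The function names, the threshold value of 0.8, and the use of Python's `difflib.SequenceMatcher` string similarity as a stand-in for an acoustic matching probability are all assumptions made for this sketch; they are not taken from the specification.

```python
from difflib import SequenceMatcher


def similarity(heard: str, candidate: str) -> float:
    """Stand-in matching score in [0, 1]; a real system would score acoustics."""
    return SequenceMatcher(None, heard.lower(), candidate.lower()).ratio()


def best_match(heard, vocabulary):
    """Return (candidate, score) for the highest-scoring entry, or (None, 0.0)."""
    best, best_score = None, 0.0
    for candidate in vocabulary:
        score = similarity(heard, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def fill_form_field(heard, specified_vocabulary, base_vocabulary, threshold=0.8):
    """First pass: specified vocabulary. Second pass: base vocabulary."""
    candidate, score = best_match(heard, specified_vocabulary)
    if candidate is not None and score >= threshold:
        # Evaluation criterion met: use the context-specific data value.
        return candidate, "specified"
    # Criterion not met: fall back to the general (base) vocabulary.
    candidate, _ = best_match(heard, base_vocabulary)
    return candidate, "base"
```

In this sketch, an input close to an entry of the specified vocabulary is resolved in the first pass, while any other input is resolved against the base vocabulary.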
[0012] According to another preferred embodiment of the invention, a method for improving the accuracy of a computerized, speech recognition system is provided that includes the steps of loading a first specified vocabulary into computer storage, the first specified vocabulary associated with a first computerized form field; accepting a user's voice input into the speech recognition system; evaluating the user's voice input with data values from the first specified vocabulary according to an evaluation criterion; selecting a particular data value as input into the first computerized form field if the user's voice input meets the evaluation criterion; loading a second specified vocabulary into computer storage, the second specified vocabulary associated with a second computerized form field; accepting a user's voice input into the speech recognition system; evaluating the user's voice input against data values from the specified vocabulary according to an evaluation criterion; and selecting a particular data value as input into a second computerized form field if the user's voice input meets the evaluation criterion. In one aspect, the evaluation criteria for the steps of evaluating the first and the second specified vocabularies are the same. In another aspect, the evaluation criteria for the steps of evaluating the first and the second specified vocabularies are different criteria. In still another aspect, the first and second computerized form fields are associated with different fields of a computerized medical form.

[0013] In yet another embodiment, the present invention provides a method for improving the accuracy of a computerized, speech recognition system that includes loading a first specified vocabulary into computer storage, the first specified vocabulary associated with a first user of the speech recognition system; accepting the first user's voice input into the speech recognition system; evaluating the first user's voice input with data values from the first specified vocabulary according to an evaluation criterion; selecting a particular data value as an input into a computerized form field if the first user's voice input meets the evaluation criterion; loading a second specified vocabulary into computer storage, the second specified vocabulary associated with a second user of the speech recognition system; accepting a second user's voice input into the speech recognition system; evaluating the second user's voice input with data values from the specified vocabulary according to an evaluation criterion; and selecting a particular data value as an input into the computerized form field if the second user's voice input meets the evaluation criterion. In one aspect, the first and second users of the speech recognition system are different doctors and the computerized form fields are associated with a field within a computerized medical form.

[0014] In still another embodiment of the present invention, a method is provided for improving the accuracy of a computerized, speech recognition system that includes loading a first specified vocabulary into computer storage, the first specified vocabulary associated with a first context used within the speech recognition system; accepting a user's voice input into the speech recognition system; evaluating the user's voice input with data values from the first specified vocabulary according to an evaluation criterion; selecting a particular data value as an input into a computerized form field if the user's voice input meets the evaluation criterion; loading a second specified vocabulary into computer storage, the second specified vocabulary associated with a second context used within the speech recognition system; accepting the user's voice input into the speech recognition system; evaluating the user's voice input with data values from the specified vocabulary according to an evaluation criterion; and selecting a particular data value as an input into the computerized form field if the user's voice input meets the evaluation criterion. In one aspect, the first context is a patient's age and the second context is a diagnosis of the patient.

[0015] In still another embodiment of the present invention, a computerized speech recognition system is provided including a computerized form including at least one computerized form field; a first vocabulary database containing data entries for the computerized form field, the first vocabulary associated with a specific criterion; a second vocabulary database containing data entries for the data field; and an input for accepting a user's vocal input, the vocal input being compared to the first vocabulary as a first pass in selecting an input for the computerized form field, and the vocal input being compared to the second vocabulary as a second pass in selecting an input for the computerized form field. In one aspect, the criterion is one or more of the following: a topical context, a specific user of the speech recognition system, a form field. In another aspect, the first vocabulary database is a subset of the second vocabulary database.
[0016] In yet another embodiment of the present invention, a database of data values for use in a computerized speech recognition system is provided including a first vocabulary database containing data entries for a computerized form including at least one computerized form field, the first vocabulary associated with a specific criterion; and a second vocabulary database containing data entries for the data field. In one aspect, the criterion is one or more of the following: a topical context, a specific user of the speech recognition system, a field.

Brief Description of the Drawings 

[0017] The invention and its wide variety of potential embodiments will be readily understood via the following detailed description of certain exemplary embodiments, with reference to the accompanying drawings in which:
[0018] FIG. 1 is a general network diagram of the computerized speech recognition system according to one embodiment of the present invention;
[0019] FIG. 2 is a system architecture diagram of a speech recognition system according to one embodiment of the present invention;
[0020] FIG. 3 shows an arrangement of a graphical user interface display and associated databases according to one embodiment of the present invention;
[0021] FIG. 4 is a graphical depiction of different text string database organizations according to one embodiment of the present invention;
[0022] FIG. 5 is a graphical depiction of one specific, text string database according to one embodiment of the present invention;
[0023] FIG. 6 is a graphical depiction of another specific, text string database according to one embodiment of the present invention;
[0024] FIG. 7 is a process flow diagram for the speech recognition system according to one embodiment of the present invention; and
[0025] FIG. 8 is another process flow diagram for the speech recognition system according to another embodiment of the present invention.

Detailed Description 

[0026] Specific examples of the present invention are provided within the following description. Persons of skill in the art will recognize that these are merely specific examples and that more general uses for the present invention are possible. Specifically, in the examples that follow, the present invention is generally described as it pertains to speech recognition within the medical field and as it may be used within a medical office. It is easily understood and recognized that other applications of the present invention exist in other fields of use, including use in a general web-based form, or web page. Further, the system of the present invention is described as being implemented in software, but hardware and firmware equivalents may also be realized by those skilled in the art. Finally, the pronoun "he" will be used in the following examples to mean either "he" or "she", and "his" will be used to mean either "his" or "her".
[0027] Fig. 1 shows a general office environment including a distributed computer network for implementing the present invention according to one embodiment thereof. Medical office 100 includes computer system 105 that is running speech recognition software, microphone input 110 and associated databases and memory storage 115. The computerized system within office 1 may be used for multiple purposes within that office, one of which may be the transcription of dictation related to the use of certain medical forms within that office. Office 1 and its computer system(s) may be connected via a link 130 to the internet in general, 140. This link may include any known or future devised connection technology including, but not limited to, broadband connections, narrow band connections and/or wireless connections. Other medical offices, for example offices 2 through N, 151-153, may also be connected to one another and/or to the internet via data links 140 and thus to office 1. Each of the other medical offices may contain similar computer equipment, including computer equipment running speech recognition software, microphones, and databases. Also connected to internet 140 via data link 162 is data storage facility 170 containing one or more speech recognition databases for use with the present invention.
[0028] Fig. 2 provides a diagram of a high-level system architecture for the speech recognition system 200 according to one embodiment of the present invention. It should be recognized that any one of the individual pieces and/or subsets of the system architecture may be distributed and contained within any one or more of the various offices or data storage facilities provided in Fig. 1. Thus, there is no preconceived restriction on where any one of the individual components within Fig. 2 resides, and those of skill in the art will recognize various advantages by including the particular components provided in Fig. 2 in particular geographic and data-centric locations shown in Fig. 1.
[0029] Referring to Fig. 2, input speech 205 is provided to the speech recognition system via a voice collection device, for example, a microphone 210. Microphone 210 in turn is connected to the computer equipment associated with the microphone, shown as 105 in Fig. 1. Computer system 105 also includes a speech recognition software system 212. Numerous, commercial speech recognition software systems are readily available for such purpose including, but not limited to, ViaVoice offered by IBM and Dragon Naturally Speaking offered by ScanSoft. Regardless of the manufacturer of the product, the speech recognition software includes, generally, a speech recognition module 217 which is responsible for parsing the input speech 205 as digitized by the microphone 210 according to various, well-known speech recognition algorithms and heuristics. Language model 219 is also typically included with speech recognition software 212. In part, the language model 219 is responsible for parsing the input speech according to various algorithms and producing fundamental language components. These language components are typically created in relation to a particular language and/or application of interest, which the speech recognition system then evaluates against a textual vocabulary database 220 to determine a match. In frame-based systems, for example, incoming analog speech is digitized and the amplitudes of different frequency bands are stored as dimensions of a vector. This is performed for each of between 6,000 and 16,000 frames per second and the resulting temporal sequence of vectors is converted, by any of various means, to a series of temporally overlapping "tokens" as defined in U.S. Pat. No. 6,073,097. These tokens are then matched with similar temporal sequences of vectors generated from strings of text in the active vocabulary according to the active language model and any active set of "learned" user-specific phonetic patterns and habits.
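As a rough illustration of the frame-based representation mentioned in paragraph [0029], the following sketch splits a digitized signal into frames and stores the average amplitude of each frequency band as one dimension of a vector. It is a simplified assumption rather than the method of any cited system: the frames here do not overlap, a naive discrete Fourier transform is used for clarity, and the names `frame_signal` and `band_amplitudes` are hypothetical. The conversion of vectors into tokens is not shown.

```python
import cmath
import math


def frame_signal(samples, frame_size):
    """Split digitized samples into consecutive, non-overlapping frames."""
    last_start = len(samples) - frame_size
    return [samples[i:i + frame_size] for i in range(0, last_start + 1, frame_size)]


def band_amplitudes(frame, num_bands):
    """Return one vector per frame: the mean spectral amplitude of each band."""
    n = len(frame)
    half = n // 2  # only the non-redundant half of the spectrum is used
    band_width = half // num_bands
    if band_width == 0:
        raise ValueError("frame too short for the requested number of bands")
    magnitudes = []
    for k in range(half):
        # Naive DFT coefficient for frequency bin k, normalized by frame length.
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        magnitudes.append(abs(acc) / n)
    return [
        sum(magnitudes[b * band_width:(b + 1) * band_width]) / band_width
        for b in range(num_bands)
    ]
```

Applying `band_amplitudes` to each frame returned by `frame_signal` yields the temporal sequence of vectors referred to above.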

[0030] General text database 220 is typically included as part of speech recognition software 212 and includes language text that is output by the speech recognition software once a match with the input speech is made. General or base vocabulary database 220 may contain the textual vocabulary for an entire language, e.g. English. More typically, however, the base vocabulary database contains a sizable subset of a particular language or desired application, e.g. hundreds of thousands of words. Those of skill in the arts of database management and computer science will realize that certain inherent computational difficulties and computer processing problems exist in the use and management of databases of this size. The principal barrier to accurate speech matching (recognition) with large vocabularies is "background noise" in the form of sufficient numbers of phonetically similar text mismatches in the vocabulary to give an unacceptable frequency of transcription errors. Other problems include the latency associated with full database searches for textual matches corresponding to input speech and the time and computer processing resources that must be expended within applications in which the base vocabulary database is swappable and must be replaced. These problems will arise, for example, with rapid swapping of large vocabulary databases in different languages.

[0031] Following a textual match from the speech input by speech recognition system 212, the text output from base vocabulary database 220 is then provided as input to any one of a number of other computer-based applications 230 into which the user desires the text. Examples of typical computer applications that are particularly suited for use with speech recognition software include, but are not limited to, word processors, spreadsheets, command systems and/or transcription systems that can take advantage of a user's vocal input. Alternatively, as more text-based applications accompany people's use of the internet, for example, such vocal input may be used to provide inputs to a text field within a particular form, field or web page displayed by an internet browser.
[0032] Although the initial applications of the present invention are directed to voice-to-text applications in which vocal input is provided and textual output is desired, other applications are envisioned in which any user or machine provides an input to a recognition system, and that recognition system provides some type of output from a library of possible outputs. Examples of such applications include, but are not limited to, a search and match of graphical outputs based on a user's voice input or an action-based output (e.g. a computer logon) based on a vocal input. One example of an action-based output may be to provide access to one of several computer systems, the list of computer systems being stored in a database of all accessible computer systems based on a user's bio-input (e.g. fingerprint) or a machine's mechanical input (e.g. a login message from a computer).
[0033] Referring again to Fig. 2, the speech recognition/voice transcription system further includes a specified database of text string values that provide a first-pass output in response to a particular speech input against which the system attempts to determine a match. These text strings may be stored in any one of a number of formats and may be organized in any one of a number of manners depending on the practical application of the system. In one particularly preferred embodiment, the text strings within specified database 250 are provided from the vocal inputs of previous users of the speech recognition system. Using the doctor's office example shown in Fig. 1, the first-pass text strings may be organized by users (e.g. doctors) of the system such that those text strings used by a particular doctor are loaded by the system as first-pass potential matches when that particular doctor logs into the system and/or his vocal speech is recognized and identified by the system as belonging to that doctor. Sub-databases 261, 262 and 263 illustrate such an organization based on users of the system.

[0034] Specified database 250 may also be organized according to numerous other criteria that may be advantageous to users of the speech recognition system. In another arrangement, the sub-databases of first-pass text strings within first-pass, specified database 250 may be organized by fields within a computerized or web-based electronic form. Using the example of a doctor's office once again and referring to Fig. 3, text input may need to be input into a medical form 310, that includes a patient's name, shown in computerized form field 315, the patient's address, shown in computerized form field 318, the patient's phone number, shown in computerized form field 320, and the patient's age, shown in computerized form field 320. Sub-databases 371, 372 and 373 shown in Fig. 3 are specific examples of the general field sub-databases 271, 272 and 273 of Fig. 2. These sub-databases provide first-pass text strings for matching speech input provided by the doctor when populating form fields 315, 318 and 328 (Fig. 3) respectively.
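One possible way to hold first-pass text strings organized both by user and by form field, as described in paragraphs [0033] and [0034], is sketched below. The class name `SpecifiedDatabase`, its methods, and the choice to return the union of the field and user entries (field entries first) are assumptions made for illustration; the specification leaves the storage format open.

```python
class SpecifiedDatabase:
    """Toy first-pass store with one sub-database per user and per form field."""

    def __init__(self):
        self._by_user = {}   # user id -> list of text strings
        self._by_field = {}  # form field name -> list of text strings

    def add(self, text, user=None, field=None):
        """Record a previously dictated text string under its user and/or field."""
        for index, key in ((self._by_user, user), (self._by_field, field)):
            if key is None:
                continue
            entries = index.setdefault(key, [])
            if text not in entries:
                entries.append(text)

    def first_pass_candidates(self, user=None, field=None):
        """Return first-pass candidates for the supplied context, without duplicates."""
        candidates = []
        for index, key in ((self._by_field, field), (self._by_user, user)):
            for text in index.get(key, []):
                if text not in candidates:
                    candidates.append(text)
        return candidates
```

A caller would load the candidates for the active form field, the logged-in doctor, or both, and hand them to the first-pass matcher.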
[0035] As yet another example of sub-database organization within specified database 250, a context associated with some aspect of the present speech input (or even past speech input) may be used to organize and condition the data into appropriate first-pass sub-databases. For example, the sub-database 381 associated with the findings field 330 within the medical form of Fig. 3 may be conditioned upon both the history and the age of the patient under the presumption that previous findings related to a particular combination of history and age group, either within an individual medical office or in general, are more likely to be repeated in future speech inputs with respect to patients having the same combination of age range and history. As one example, the findings fields populated within a form in the office practice of a primary care physician, with a history of abdominal pain and characteristic physical findings, may be quite similar for the following two conditions: "appendicitis" as a probable "Interpretation" field for patients age 5-12; and "diverticulitis" as a probable "Interpretation" for patients age 75+. Characteristic findings (abdominal pain with what is called "rebound tenderness") will be stored in sub-database 381 and provided to "findings" field 330, while "appendicitis" and "diverticulitis" will be stored in sub-database 382 and provided to "Interpretation" field 350.
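The conditioning on history and age group described in paragraph [0035] can be pictured as a lookup keyed on both values. The sketch below uses only the two age groups and two interpretations named in the text; the table layout, the function names, and the "other" bucket for remaining ages are assumptions made for illustration.

```python
def age_group(age):
    """Map a patient's age to the age groups named in the example."""
    if 5 <= age <= 12:
        return "5-12"
    if age >= 75:
        return "75+"
    return "other"  # assumed bucket; the text gives no entries for other ages


# First-pass "Interpretation" candidates conditioned on (history, age group).
INTERPRETATION_SUB_DATABASE = {
    ("abdominal pain", "5-12"): ["appendicitis"],
    ("abdominal pain", "75+"): ["diverticulitis"],
}


def interpretation_candidates(history, age):
    """Return the first-pass candidates for this history and age, if any."""
    return INTERPRETATION_SUB_DATABASE.get((history, age_group(age)), [])
```

When no conditioned entry exists, the empty result would leave the input to be resolved by a less specific vocabulary.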
[0036] Specified database 250 may be created and organized in any number of ways and from any one of a number of sources of information so as to provide an accurate first-pass database for appropriate and efficient use within a particular context. If, for example, specified database 250 contains text strings organized by users of the system (a user context) under the statistical presumption that each specific doctor is more likely to repeat his or her own relatively recent utterances than earlier utterances, in situations when all other system parameters are the same, and more likely to repeat terms used by other system users or other physicians in the same specialty under otherwise identical circumstances, than to use terms neither they nor others have used in that situation, text from their own past dictations or those of others (whether manually or electronically transcribed) may be used to populate and arrange the text string values within the database. If, however, a high probability first-pass database is used to provide text strings to be input into particular fields within a computerized form, then these data values may be derived and input from previously filled-out forms. These data may then be organized into sub-databases according to form fields, for example as shown in Fig. 3 by sub-databases 371-381. Also, the specified database 250 may contain one, many or all such data for use within a particular desired context and output application. Finally, the actual data values within the database may be dynamically updated and rearranged into different sub-databases during the actual use of the speech recognition system so as to accommodate any particularly desirable speech recognition situation. In the most useful instances, the data values that populate the specified database 250 will be obtained from historical data and text strings that accompany a particular use and application of the speech recognition system.

[0037] Supplemental data may also accompany the 
data values and text strings stored within specified data- 
base 250. In particular, weightings and prioritization In- 
formation may be included as part of the textual data 
records that are to be matched to the input speech. These 
weightings may help determine which data values are 
selected, when several possible data values are matched 
as possible outputs In response to a particular speech 
input. Further, these weighting and prioritization informa- 
tion may be dynamically updated during the course of 
the operation of the speech recognition system to reflect 
prior speech input. Those of skill in the art will realize a 
plurality of ways in which the data elements within the
specified database may be rearranged and conditioned
so as to provide an optimal first-pass database for use
in the speech recognition system of the present invention.
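
The role of the weightings described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the names `TextRecord`, `pick_by_weight` and `reinforce`, and the sample addresses are hypothetical and do not come from the specification, which leaves the weighting scheme open.

```python
from dataclasses import dataclass


@dataclass
class TextRecord:
    text: str             # stored text string (data value)
    weight: float = 1.0   # hypothetical prioritization weight


def pick_by_weight(candidates):
    """Return the candidate record carrying the highest weight."""
    return max(candidates, key=lambda record: record.weight)


def reinforce(record, increment=1.0):
    """Raise a record's weight after use, so later choices reflect prior input."""
    record.weight += increment


# Two records that both matched the same speech input (illustrative data only).
records = [TextRecord("12 Oak Street", 3.0), TextRecord("12 Oak Tree Road", 1.0)]
chosen = pick_by_weight(records)
reinforce(chosen)  # dynamic update during operation
```

Here the more heavily weighted record wins the tie between two plausible matches, and its weight grows each time it is selected.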
[0038] Referring again to Fig. 2, the speech recogni-
tion/voice transcription system of the present invention
further includes a context identification module 240. The 
context identification module is coupled to one or more 
input and recognition components (Fig. 2, 205-230) of
the overall speech recognition system 200 and is used
to select or create a proper sub-database within the entire
specified database 250. If, for example, the desired sub- 
databases to be used are based on a user context, then 
the context identification module may take input from a
user identification device (not shown) or may determine
the user from speech characteristics determined by the
speech recognition software so as to select an appropri-
ate user sub-database (e.g. 261) from the entire specified
database 250. Alternatively, the data values within the 
specified database 250 may be loosely organized and 
the context identification module may actually condition 
the data values so as to dynamically create an appropri- 
ate user sub-database from the information stored within 
the specified database. As another example, the context 
identification module may monitor and interpret a partic-
ular form field that is active within an application 230 into
which text input is to be provided. After making such a
determination, the context identification module may se- 
lect, or as mentioned above, actually condition the data 
values so as to dynamically create, an appropriate user 
sub-database from the information stored within the 
specified database. 
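
As a rough sketch of how a context identification module might dynamically create a sub-database from loosely organized records, the Python fragment below filters one record list by user and/or active form field. The dictionary keys, the function name `select_sub_database` and the sample records are assumptions made for illustration, not details taken from the specification.

```python
def select_sub_database(records, user=None, field=None):
    """Keep only the records whose relational data match the given context.

    A context value of None means "do not filter on this criterion".
    """
    return [
        record for record in records
        if (user is None or record["user"] == user)
        and (field is None or record["field"] == field)
    ]


# Loosely organized records, each carrying its own relational data (hypothetical).
records = [
    {"text": "pneumonia", "user": "Dr. Smith", "field": "interpretations"},
    {"text": "12 Oak Street", "user": "Dr. Smith", "field": "address"},
    {"text": "dysphagia", "user": "Dr. Brown", "field": "interpretations"},
]

by_field = select_sub_database(records, field="interpretations")
by_user_and_field = select_sub_database(
    records, user="Dr. Brown", field="interpretations"
)
```

Selecting by field alone yields a field context sub-database, while combining both criteria narrows it to one user's entries for that field.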

[0039] Referring again to Fig. 2, the speech recogni-
tion/voice transcription system may further include a pri-
oritization module 245. As with the context identification
module, the prioritization module may be coupled to any 
one or more input and recognition components (Fig. 2,
205-230) within the overall speech recognition system
200 including the specified database 250. As mentioned
above and provided in more detail below, the prioritiza-
tion module assists in collecting actual use information
from the speech recognition system and using that data
to dynamically prioritize the data values within any or all
of the sub-databases contained within specified data- 
base 250. 

[0040] In one particularly preferred embodiment of the 
present invention, specified database 250 contains text
strings as selectable data values for input into medical
forms in a word processing application 230. The text
strings may be organized according to a number of dif- 
ferent criteria based on the users of the forms and/or the 
fields within the electronic forms. As shown in Fig. 3, a
computer-based electronic medical form 310 shows sev-
eral fields within a medical report. For example, compu-
terized electronic form 310 may include a name field 315,
an address field 318, a phone number field 320, as well
as more general fields such as a findings field 330 and 
an interpretations field 350. One possible organization 
of the text string data values within specified database
250 is to associate each text string with each field within
a particular electronic form. As shown in Fig. 3, text string
sub-database 371 may be associated with name field
315, text string sub-database 372 may be associated with
address field 318 and text string sub-database 381 may
be associated with findings field 330. In this particular
example, two separate organizations of the text strings
exist within specified text string sub-databases 371
through 382. For single context fields, the name field 315
for example, sub-database 371 may contain text strings
that only indicate patients' names. Likewise, text string
sub-database 372 associated with address field 318 of 
electronic computer form 310 may contain only text 
strings associated with street addresses. 

[0041] It should be noted that the data organizations
referenced by 261-283 in Fig. 2 and 379-382 in Fig. 3
are logical organizations only. The data records within
specified database 250 may be organized, arranged and
interrelated in any one of a number of ways, two of which
are shown in Fig. 4. Referring to Fig. 4, the organization
of the records within specified database 450 may be
loose, i.e. all records may be within one file 455 where
each record (and output text string) contains a plethora
of relational information (Option A). The relational infor-
mation within the singular file would then, presumably,
be able to be used to create the logical divisions shown
in Figs. 2 and 3. One example of a sub-database might
be a field context sub-database 471, for example, where
the relational data pertaining to the form field within file
455 is used to organize the sub-database. Alternatively,
organization of the records within specified database 250
may be tight, i.e. records (and output text strings) may
be highly organized according to context/field/user such
that a one-to-one relationship exists between a particular
file of records (sub-database) and a form field or user,
as shown in option B of Fig. 4. While the organization
provided in option B may require more computer memory
because of the information redundancy needed to create
all the discrete sub-databases, this disadvantage in the
overall database size 450 may be offset by the advantage
of having smaller physical files 456-458 that can be more
quickly swapped in and out of computer memory within
the speech recognition system. In general, those of skill
in the art will realize that different organizations of the
same data will provide various advantages and that such
data may be organized to optimize any one of a number
of parameters and/or the overall system operation so as
to enhance the advantages of the present embodiment.
Finally, a combination of both database organizations
could be used to provide a system that has the advan-
tages of the present embodiment.
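
The difference between the loose (Option A) and tight (option B) organizations can be illustrated with a small Python sketch. The record layout and the helper names `logical_view` and `split_by_field` are hypothetical; the specification does not prescribe any particular storage format.

```python
from collections import defaultdict

# Option A: a single "file" of records, each carrying relational information.
option_a = [
    {"text": "John Doe", "field": "name"},
    {"text": "12 Oak Street", "field": "address"},
    {"text": "34 Elm Avenue", "field": "address"},
]


def logical_view(records, field):
    """Option A: derive a field sub-database on demand from the single file."""
    return [record["text"] for record in records if record["field"] == field]


def split_by_field(records):
    """Option B: materialize one discrete sub-database per form field."""
    files = defaultdict(list)
    for record in records:
        files[record["field"]].append(record["text"])
    return dict(files)


option_b = split_by_field(option_a)
```

Both organizations expose the same logical sub-database for a field. The trade-off is that Option A filters the whole file on every request, whereas option B stores separate smaller files that can be loaded individually.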
[0042] Regardless of the data organization of specified 
database 250, two types of specified sub-databases are
contemplated. The first type may be classified as a sin-
gular context sub-database in that one specific criterion
provides the motivation for grouping and organizing the
records to create the sub-database. One specific embod-
iment of this type of specified sub-database, 371 of
Fig. 3, is shown in more detail in Fig. 5, where text string
records containing street addresses are stored within
sub-database 571 in tabular format. In this particular em-
bodiment, individual records 510, 511 and 512 contain
text strings of previously dictated (specified) street ad-
dresses which are provided for the purpose of matching
a user's speech input when the address field 318 (Fig. 3)
is the active dictation field. Other data, such as weighting
information 552 and user's data 554, may also be includ-
ed within text string sub-database 371. With reference
to the specific example of Fig. 5, the data records within
the sub-database 571 contain text strings and accompa-
nying relational data intended for use only within a spe-
cific field within a computerized form or web page. Other
specified sub-databases similar to 571 may contain text
strings and accompanying relational data that is intended
for use with only one of the users of the speech recog- 
nition system. 

[0043] In a second sub-database type, multiple context 
organizations of the data within specified database 250
are also created. For example, medical form 310 of Fig.
3 may contain input fields that are related to other input
fields within the overall electronic form. This interrelation-
ship typically occurs when the voice dictation provided
as an input to a field within an electronic form is of a more
general nature. In particular, the organization of the text
strings within a sub-database may not be based on a
single, external context, such as a specific user of the
system or a particular field within an electronic form, but
rather may be based on the interrelation of the actual text
strings in a more complex manner. As one example, con-
text specific sub-databases 381 (pertaining to the med-
ical findings field) and 382 (pertaining to the medical in-
terpretations field) may include contextually intertwined
text strings that the speech recognition system of the
present invention must identify and properly select so as
to achieve the efficiencies of the present embodiment.
These more complex, contextually intertwined text string
sub-databases are shown as logical sub-databases
281-283 in Fig. 2.

[0044] A simplified example of the above-mentioned 
text string interrelation is provided below. As shown in 
Fig. 3, sub-database 381 provides text strings that may 
be input into findings field 330 and sub-database 382 
provides text strings that may be input into interpretations
field 350. However, unlike fields with a limited range of
accepted input within the electronic computer form, the
name field 315 for example, sub-database 381 is de-
signed to match text strings to a more general and varied
voice input provided to the speech recognition system.
Fig. 6 shows one specific embodiment of the specified
text string sub-database 382 of Fig. 3. Sub-database 382
provides text string records related to medical interpre-
tations which are stored within sub-database 682 in tab-
ular format. In this particular embodiment, individual
records 615, 616 and 617 contain text strings from pre-
viously dictated (specified) interpretations which are pro-
vided for the purpose of matching a user's speech input
when the interpretations field 350 (Fig. 3) is the active
dictation field. Other relational data, such as weighting
information 652 and interrelational context information
(e.g. age 654, user 656, findings 658) may also be in-
cluded within text string sub-database 682. In the exam-
ple of Fig. 6, interpretations text strings, such as pneu-
monia and dysphagia, are provided as potential text
strings to be evaluated against a user's dictation to pro- 
vide a text input to the interpretations field. 
[0045] Also shown in Fig. 6 are two similar sounding
medical terms that have entirely different meanings: dys-
phagia - a difficulty in swallowing, and dysphasia - an
impairment of speech consisting in lack of coordination
and failure to arrange words in a proper order. The inter-
pretations sub-database 682 includes both textual inputs
as records 616 and 617 respectively. Exemplary interre-
lational data are also included as data within the text
records of the sub-database. Such data include
a patient's history 654, a user of the system 656, the
specific findings regarding the patient 658, as well as a
general, historical weighting based on the number of
times the two terms have been used 652. During a dicta-
tion into the interpretations field 350 of electronic form
310, table 682 is loaded and consulted to achieve the
best possible textual input for dictated speech. If, for ex-
ample, the phonetically similar word dysphagia/dyspha-
sia is dictated into the system then the context interpre-
tation module would evaluate that voice input in view of
any one or combination of contextual data. In one case,
if the patient's past medical history included digestive
complaints then the more probable textual match, dys-
phagia, may be selected. Similarly, if the patient's past
medical history included neurological complaints, the
term dysphasia may be selected. Similarly, the context
identification module may rely upon other relational data
associated with the two text strings to determine the high-
est probability input. If Dr. Brown is a pediatrician and Dr.
Smith is a geriatric physician, then appropriate weight
may also be given by the selection system to these pre-
vious inputs in determining the proper text input for the
interpretations field. Likewise, the input to the findings
field 330 may be considered, in which a "difficulty swal-
lowing" would result in a more likely match with dysphagia
and "speech impairment" would result in a more likely
indication of dysphasia. In addition, other simple weight-
ing factors such as the number of times each term has
been used previously may also be used by the system
of the present invention to select a more probable input
text string. Finally, the system may use one, many, or all
of the aforementioned contextual relationships to deter-
mine and select the proper text input, possibly after as-
signing additional weighting function to the interrelational
data itself, i.e. weighting a user's context higher than the
age context.
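
The dysphagia/dysphasia selection described above can be approximated by a simple additive score. The sketch below is one possible reading, not the method of the specification: the per-context weights, the use counts, the table layout and the names `score` and `disambiguate` are all assumed for illustration.

```python
# Hypothetical stand-in for table 682: contextual data stored with each term.
CANDIDATES = {
    "dysphagia": {"history": "digestive",
                  "findings": "difficulty swallowing", "uses": 12},
    "dysphasia": {"history": "neurological",
                  "findings": "speech impairment", "uses": 9},
}


def score(record, context, w_history=2.0, w_findings=3.0, w_uses=0.1):
    """Sum a usage-count term plus a bonus for each matching contextual datum."""
    total = w_uses * record["uses"]
    if context.get("history") == record["history"]:
        total += w_history
    if context.get("findings") == record["findings"]:
        total += w_findings
    return total


def disambiguate(context):
    """Return the phonetically similar term with the highest contextual score."""
    return max(CANDIDATES, key=lambda term: score(CANDIDATES[term], context))
```

For example, `disambiguate({"history": "neurological"})` favors dysphasia, while `disambiguate({"findings": "difficulty swallowing"})` favors dysphagia. Giving the findings context a larger weight than the history context mirrors the idea of weighting one kind of interrelational data above another.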

[0046] In operation, a user of the speech recognition 
system of the embodiment inputs speech 205 to micro- 
phone 210 for processing by speech recognition system 
212. As a stand-alone system, speech recognition sys-
tem package 212 typically provides a single, general or
base vocabulary database 220 that acts as a first and
only database. Because of the size of the database and
the general nature of the language and the text strings
contained within it, voice-to-text transcription accuracies
may vary when the speech recognition system is used
only with such large, non-specific vocabularies. In med-
ical contexts, for example, inaccuracies in transcription
of dictation may result in undesirable or even disastrous
consequences. Thus, the inaccuracies generally tolerat-
ed by system users must be reduced. Greater transcrip-
tion accuracy, as well as consistency in the dictation with-
in fields of an electronic, computer-based form, for ex-
ample, may be achieved through the use of multiple da-
tabases containing text strings previously used in differ-
ent contexts. Specifically, through the proper selection
of a first-pass database containing a limited but special-
ized vocabulary and the insertion of this first-pass data-
base into the existing processing used by commercial
voice transcription systems, the transcription accuracies
of these systems can be markedly improved. Failing a
match in the more specific, first-pass database, the
speech recognition system can always default to the
more general, base vocabulary to provide a textual match
for the dictated input.

[0047] According to various embodiments of the 
present invention, the specified database 250 is used by 
the speech recognition system as a first-pass database 
in selecting an appropriate textual match to the input
speech 205. The context identification module 240 is re- 
sponsible for selecting and loading (or creating) a partic- 
ular sub-database from specified database 250 during a 
user's dictation so as to provide a high probability of a 
"hit" within that sub-database. The selection process em- 
ployed by context identification module is based on a 
context of the input speech or a context within the dicta-
tion environment. Possible contexts include, but are not
limited to, a particular user of the speech recognition sys-
tem, a particular field within an electronic form being proc-
essed by the speech recognition system, or the interre-
lation of previously input text with a sub-database of text
that is likely to be dictated based thereon.
[0048] Thus, the inherent value of specified database 
250 lies in its historical precedent as optionally condi-
tioned with weighting functions that are applied to the
text strings within the database. Thus, the creation of a
specified database is central to its effective use within
the speech recognition system. 
[0049] Specified database 250 may be created in any
of a number of manners. In one particularly preferred 
embodiment, past forms may be scanned and digitally 
input into a computer system such that all the text strings 
used within those computer forms are digitized, parsed 
and then stored within the database. The text strings may 
then be subdivided into specific databases that are ap-
plicable to specific speech recognition circumstances.
For example, with respect to the example addresses
sub-database shown in Fig. 5, a series of previously re-
corded paper or electronic medical forms may be parsed,
separated and stored such that all the street addresses
used on those forms are stored in a separate portion 271
of database 250. Likewise, findings within field 330 and
interpretations within field 350 of the electronic form in
Fig. 3 may be subdivided from general text string data-
base 250 to create a specific contextual database of di-
agnoses for use with a particular medical form. As pre-
viously described, those of skill in the art will recognize
that specified database 250 may be organized in any one
of a number of different ways to suit the particular needs
of a particular speech recognition application, such as
textual input into an electronic form. Such organization
may take place statically, i.e. before the user employs
the voice transcription system, or dynamically, i.e. during
the use of the voice transcription system. In the dynamic 
context, certain relationships among sub-databases may 
also be leveraged to provide inputs between computer- 
ized form fields. 

[0050] Referring to Fig. 7, a general process flow is
provided for the operation of speech recognition system
200. The process starts with step 705 in which the speech
recognition system is loaded and has begun to operate.
Specified vocabulary databases may be defined and
loaded here for a particular, more global use during the
remainder of this process. Next, a user of the system is
identified at step 707. As one example, the user may be
a particular doctor who wishes to provide speech input
to a medical form as part of his practice within a practice
group or a medical office. As described above, this user
ID may later be used to select appropriate sub-databases
and associated text strings from specified database 250.
User identification may be done through speech recog-
nition, keyboard entry, fingerprinting or by any means
presently known or heretofore developed. Next, voice
input from the user is provided to the speech recognition
system in step 710. This vocal input is digitized for use
within computer system 105 and is then input into the
speech recognition system employed on that computer
system as shown in step 720.

[0051] Next, the context identification module selects 
or creates an appropriate sub-database consisting of a 
subset of the text strings within database 250 as the sys-
tem's operative first-pass database at step 730. As de-
scribed above, the selection of an appropriate sub-data-
base may occur according to any one or more of a number
of different criteria. In one particularly preferred embod-
iment, the criterion on which the sub-database is selected
is based upon the user of the voice transcription system
as provided in step 707. Specifically, any particular user
may have a historical use of certain words and phrases
which may serve as a higher probability first-pass source
of text string data for future use by that particular user.
Thus, the appropriate selection of that database will re-
sult in higher transcription accuracy and use within the
speech recognition system. 

[0052] According to another particularly preferred em-
bodiment of the present invention, the sub-database is
selected from the specified database 250 at step 730
according to the field within the electronic form into which
text is being input. For example, referring to Fig. 3, when
a user wishes to populate address field 318 with a par- 
ticular address, the user would indicate to the system at 
step 730 (e.g. through a computer graphical user inter- 
face or a vocal command input) that the address field is 
to be populated. The speech recognition software then 
selects or creates an appropriate sub-database from 
specified database 250 that contains at least the ad- 
dresses for use within that form field. The actual data 
selected and pulled by the context identification module, 
as mentioned above, would typically include related con-
textual information that would provide insight into the his-
torical use of particular addresses so as to provide a high-
er probability in transcription accuracy.
[0053] Referring back to Fig. 7, the speech input pro-
vided by the user to the speech recognition system at
step 720 is evaluated by that system with respect to the 
text strings within the sub-database selected in step 740. 
This evaluation may be performed according to the same
algorithms and processes used within the speech recog- 
nition system 212 which are used to select matching text 
from its own base vocabulary in database 220. Various 
methods and mechanisms by which the input speech is
parsed and converted to a language output and/or text
string output are well-known in the art, and these text
matching mechanisms and evaluation criteria are inde-
pendent of the other aspects of the present invention.
Furthermore, other known evaluation criteria may be 
used on the overall database 250 or the sub-database 
selected in step 730. Such evaluation methods are well- 
known, although particular evaluation criteria that are ap- 
plicable to speech recognition principles may also be em- 
ployed when populating a field within an electronic form. 
As an example, the specific text strings of a particular 
sub-database, such as that shown in Fig. 5, may include
a weighting function as shown in field 552 of sub-data-
base 571. The weighting field, for example, may include
the number of times a particular address has been input
into a form within a specific historical period. Even with
this over-simplified weighting scheme, ambiguities as be-
tween two very similar addresses may be easily resolved
in determining a proper textual match corresponding to
a speech input. Other weighting schemes, using both
objective indicia (e.g. data use count) and subjective in-
dicia (e.g. weights related to the data itself and its inter-
relation with other data) are well known in the art and
may also be included within specific database 571 for
use in the context identification module. Further, other
evaluation criteria may be used to select an input text
string from the sub-database. For example, a most-re-
cently-used algorithm may be used to select data that
may be more pertinent with respect to a particular tran-
scription. Other weighting and evaluation criteria are well-
known and those of skill in the art will appreciate different
ways to organize and prioritize the data so as to achieve
optimal transcription accuracy. Finally, a prioritization
module 245 may be included as part of the speech rec-
ognition system 200 to implement and manage the
above-mentioned weighting and prioritization functions.
[0054] If the evaluation of the voice input at step 740 
results in a match within the selected sub-database of
text strings according to the evaluation criterion, then that
text string is selected as an output at step 750 and the
text string is used to populate the desired field within the
electronic form at step 760. Alternatively, if the evaluation
criterion is not met at step 740, the speech recognition
system would default to base vocabulary database 220
at step 770, at which point the speech recognition soft-
ware would transcribe the user's voice input in its usual
fashion to select a text string output (step 750) according
to its own best recognition principles and output the same
to the electronic form (step 760). 
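
The first-pass match with fallback to the base vocabulary (steps 740 through 770) can be sketched as follows. This is a simplified illustration with loud assumptions: real systems match acoustic features, whereas this sketch compares a textual recognizer hypothesis against stored strings using Python's standard `difflib.SequenceMatcher`; the threshold value and the names `best_match` and `transcribe` are hypothetical, and both vocabularies are assumed to be non-empty.

```python
from difflib import SequenceMatcher


def best_match(hypothesis, vocabulary):
    """Return (text, similarity) for the closest vocabulary entry.

    String similarity is only a stand-in for the recognizer's own matching.
    """
    scored = [
        (text, SequenceMatcher(None, hypothesis.lower(), text.lower()).ratio())
        for text in vocabulary
    ]
    return max(scored, key=lambda pair: pair[1])


def transcribe(hypothesis, first_pass, base_vocabulary, threshold=0.8):
    """Try the first-pass sub-database, then fall back to the base vocabulary."""
    text, similarity = best_match(hypothesis, first_pass)   # step 740
    if similarity >= threshold:
        return text, "first-pass"                           # steps 750 and 760
    text, _ = best_match(hypothesis, base_vocabulary)       # step 770
    return text, "base"


addresses = ["12 Oak Street", "34 Elm Avenue"]   # illustrative first-pass entries
general = ["hello", "world"]                     # illustrative base vocabulary
hit = transcribe("12 oak stret", addresses, general)
miss = transcribe("world", addresses, general)
```

The second element of the returned pair records which database supplied the output, which is the kind of actual use information a prioritization module could collect.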
[0055] It should be recognized that the steps provided 
in Fig. 7 may be repetitively performed in a number of
different ways. For example, as one particular electronic
form is filled out, sequential fields within that form need
to be designated and then populated with an appropriate
text string. As such, following the insertion of a particular
text string within a particular form field, the process of
Fig. 7 may return to step 720 where the user inputs ad-
ditional speech input after selecting the new field into
which the vocal input is to be transcribed. During this
second iteration, a second, appropriate sub-database of
text strings from specified database 250 would be select-
ed as an appropriate first-pass database for the second
field. The process of evaluating and matching the user's
vocal input with text strings within the second sub-data-
base, i.e., steps 740 through 770, would operate as men-
tioned above.

[0056] In another operative alternative, a second user
may employ the speech recognition system, in response
to which different sub-databases of text strings would
be loaded that pertain to the specific use of that second
user at step 730. In this iterative process, a second user
would be identified at step 707, after which the speech
input provided by that second user would be digitized
and processed by the speech recognition system at step
720. The selection and/or creation step 730 may or may
not be performed (again) and may be omitted if the only
sub-database selection step is conditioned upon a user.
The remainder of the process provided in Fig. 7 may then
be performed to select an appropriate text string as input 
into the fields of the electronic form for that second user. 
[0057] Specific scenarios in which the embodiments
might be used in a medical office are provided below. 

[0058] Example #1: A new radiologist joins a group of
radiologists who have been using voice recognition tech-
nology to dictate reports for about two years. Their prac-
tice has a four year old database of digitally recorded
imaging studies, linked to a database of the past two
years of computer-transcribed reports as well as several
years of prior reports manually transcribed to computer
by transcriptionists listening to voice recordings. The new
radiologist has "trained" the voice engine to recognize
his voice as a new user by engaging in a set of radiology
voice training exercises that are customized to include
phrases commonly used by other members of his group.
[0059] If the new radiologist's first assignment using 
the system is to dictate a report on a sinus CT scan, the
radiologist would identify this report as being for a sinus
CT scan and click on the "findings" field at which time the
program will load a specified vocabulary for first pass
pre-screening composed of text strings that other mem-
bers of the group have previously used in their dictations
as input to the "findings" field for sinus CT scans. 
[0060] Since the new radiologist is more likely to use 
terms previously used by his colleagues in dictating re- 
ports of previous sinus CT scans than other x-ray related 
terms that are phonetically similar, pre-screening the
new radiologist's dictation to match text strings previously
used by his colleagues, for example, in the "findings"
field, will deliver a higher transcription accuracy than the
use of a general radiology dictionary or a full English lan-
guage vocabulary. This is so even if the general radiology
vocabulary has been enriched by "learning" the preferred
terminology and syntax of his colleagues. When the ra-
diologist advances to the "interpretations" field, the virtual
vocabulary previously loaded for the "findings" field will
be unloaded and replaced by a similarly selected virtual
vocabulary for the "interpretations" field.
[0061] As the new radiologist uses the system, the pri-
oritization algorithm administered by the prioritization
module for his specific user sub-database files may as-
sign relatively higher prioritization scores to his own dic-
tated text strings vis-a-vis the dictated text of his col-
leagues. Over time it will adapt to his personal style, fur-
ther improving transcription accuracy.
[0062] Assume that on his second day of work, the 
new radiologist is assigned to read studies of the diges-
tive system, and his first two cases are barium swallow
studies of the upper gastrointestinal tract. The first case
is for the evaluation of a two-month old infant suffering
from vomiting, and the second case is a follow-up study
for an 87 year-old man with esophageal strictures. While
the study is the same, his findings and interpretations in
the two cases are likely to be different. Depending on the
number of prior reports in his practice group's database,
the transcription accuracy of the new radiologist's reports
may be maximized by applying more complex prioritiza-
tion and selection algorithms to the selection of previous-
ly-used phrases to be loaded for first pass pre-screening.
The weighting of previously used text strings and the se-
lection of those data items as first-pass text string values
for these reports could result in the assignment of multi-
pliers to those data items. These weights could be up-
dated not only each time the first-pass text strings were
previously used but also based on the type of study, the
age of patient and the diagnoses or symptoms listed as
reasons for physician's request in ordering the study. For
the above-mentioned infant, weighting factors for text
string prioritization and selection could, for example, be
based on prior frequency of use in reports of all barium
swallow studies in children aged less than 6 months or
less than one year. For the 87 year old man, such prior-
itization could, for example, be based on the frequency
of use of those text strings in reporting barium swallow
studies in patients in any one or more of the following
classes: patients more than age 60/70/80; use of those
text strings in reporting barium swallow studies in males
in these age ranges; prior use of those text strings in
reporting barium swallow studies in patients with a prior
diagnosis of esophageal stricture; prior use of those text
strings in reporting barium swallow studies of patients
with a prior diagnosis of esophageal stricture by age
and/or sex; and/or the presence or absence of other
symptoms (such as swallowing pain or vomiting). Finally,
the weighting factors related to the presence or absence
of a symptom, including associated diagnoses (such as
status post radiation therapy for a specific type of lung
cancer) may be listed in the ordering physician's request
for the procedure or may already be present in the data-
base of prior diagnoses for that patient.
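
One way to picture the multiplier-based prioritization described above is to count prior uses of each text string in reports of the same study type, boosting reports whose patients fall in the same age class. The sketch below is a minimal illustration under assumptions: the report layout, the single age cut-off, the multiplier value, the function name `prioritize` and the placeholder phrases are all hypothetical.

```python
from collections import Counter


def prioritize(prior_reports, study, age_months,
               age_limit_months=12, multiplier=3.0):
    """Rank text strings by weighted prior use in reports of the same study type.

    Reports from the same age class as the current patient count `multiplier`
    times; other reports of the same study count once.
    """
    scores = Counter()
    current_is_infant = age_months < age_limit_months
    for report in prior_reports:
        if report["study"] != study:
            continue
        same_class = (report["age_months"] < age_limit_months) == current_is_infant
        weight = multiplier if same_class else 1.0
        for text in report["texts"]:
            scores[text] += weight
    return [text for text, _ in scores.most_common()]


# Placeholder phrases stand in for real report text (illustrative data only).
reports = [
    {"study": "barium swallow", "age_months": 3, "texts": ["phrase-A"]},
    {"study": "barium swallow", "age_months": 1000, "texts": ["phrase-B"]},
    {"study": "barium swallow", "age_months": 984, "texts": ["phrase-B"]},
    {"study": "sinus CT", "age_months": 400, "texts": ["phrase-C"]},
]

infant_ranking = prioritize(reports, "barium swallow", age_months=2)
elderly_ranking = prioritize(reports, "barium swallow", age_months=87 * 12)
```

The same prior reports produce different first-pass orderings for the infant and for the 87 year old patient, and strings from unrelated study types are excluded. Further factors such as sex or prior diagnosis could be added as additional multipliers in the same manner.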
[0063] There may be an increased likelihood that text
strings will be used in a radiology report if they have pre-
viously been used in reporting the same type of study or
a related study for the same patient (as when high res-
olution chest tomography is ordered as a follow up to an
abnormal chest x-ray). Dictation transcription accuracy
may thus be improved by a prioritization algorithm that
assigns increased weight to text strings that are previ-
ously used in reporting studies with these types of rela-
tionship to a study currently being conducted.
[0064] The larger the group of users that share common data and voice
match text string sources, the greater the extent to which increasingly
complex prioritization algorithms can increase transcription accuracy.
In certain context-driven applications, such as dictations related to
the practice of medicine, the greater the linkage of source dictated
text to the text strings from which it came, the better the ability to
retrospectively analyze prioritization algorithm performance and
compare the efficiency of the first-pass vocabulary based on different
weighting assignments for different factors in the prioritization
algorithm. This makes it possible to create first-pass databases for
use in large installations, as they accumulate data with use, thereby
allowing complex prioritization algorithms to be optimized based on
their own prior experiences.

[0065] Example #2: A physician dictates into either a computerized
medical record database or a structured consultation report form as he
examines a patient in an office setting. In this scenario, the medical
report will usually begin with a listing of the problem(s) for which
the patient is being seen. These factors, in addition to age and sex,
serve as effective weighting factors so as to allow the prioritization
of previously-used text strings and load the most probable first-pass
text strings for each report. Previous diagnoses, if noted in an
initial consultation or if already present in the database from
previous diagnosis of the same patient, may also be useful as text
string weighting factors for sub-database prioritization and selection.
If the patient has been previously seen and his or her own previous
reports are included in the same database, it may be efficient to
assign a first multiplier or weighting factor to every prior text
string used in previous reports for that patient and another multiplier
or weighting factor for each text string used in the reports for which
each specific diagnosis is listed among the reasons or problems
assessed at this visit.
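
As a rough illustration of the two multipliers mentioned above, the following Python sketch applies one multiplier to strings from the patient's own prior reports and another for each diagnosis shared with the current visit. The report layout, the function name `weight_strings` and the multiplier values are assumptions made for the example; the description does not prescribe them.

```python
def weight_strings(reports, patient_id, visit_problems,
                   patient_multiplier=2.0, diagnosis_multiplier=1.5):
    """Accumulate a weight for each text string found in prior reports.

    reports: list of dicts with "patient_id", "diagnoses" and "strings"
             keys (hypothetical layout, assumed for illustration).
    """
    problems = set(visit_problems)
    weights = {}
    for report in reports:
        factor = 1.0
        relevant = False
        # First multiplier: the report belongs to the same patient.
        if report["patient_id"] == patient_id:
            factor *= patient_multiplier
            relevant = True
        # Second multiplier: once per diagnosis shared with this visit.
        for _ in problems & set(report["diagnoses"]):
            factor *= diagnosis_multiplier
            relevant = True
        if not relevant:
            continue
        for text in report["strings"]:
            weights[text] = weights.get(text, 0.0) + factor
    return weights
```

Reports that match neither the patient nor any listed problem contribute nothing, so unrelated text strings stay out of the first-pass set.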

[0066] With respect to electronic forms, a computerized medical record
has functionally separate data fields. In addition, other types of
medical reports have structured sections. Speech recognition
transcription accuracy for each such application can be enhanced
through the prioritization and selection of first pass text string
databases for each such field on the basis of numerous factors
including, but not limited to: the age and sex of the patient; problems
listed as reason for that patient's visit or to be determined during
that patient's visit; previously recorded diagnoses for that patient;
previous use of text strings to be prioritized by that physician in
reports for that patient; previous use of those text strings with that
combination of other selection factors by that physician for other
patients; and/or previous use with that combination of other factors by
other members of that specialty.
[0067] As in Example #1, as each office that uses the embodiment
accumulates data, it becomes possible to retrospectively analyze
prioritization algorithm performance and compare the first-pass hit
efficiency of different weighting assignments for different factors in
the prioritization algorithm. This allows the initial data record
selection scheme to be optimized and permits a quantitative analysis of
the relative efficiency of various prioritization models and weightings
for the various offices.
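
One simple way to picture such a retrospective comparison is sketched below in Python. The metric (the fraction of dictated strings found in the first-pass vocabulary) and the function names are assumptions for illustration only; the description does not define how hit efficiency is computed.

```python
def first_pass_hit_rate(vocabulary, dictated):
    """Fraction of dictated strings that the first-pass vocabulary contained."""
    if not dictated:
        return 0.0
    vocab = set(vocabulary)
    return sum(1 for text in dictated if text in vocab) / len(dictated)

def best_weighting(candidates, dictated):
    """Return the label of the weighting whose vocabulary scored best.

    candidates: dict mapping a weighting label to the first-pass
                vocabulary that weighting would have produced.
    """
    return max(candidates,
               key=lambda label: first_pass_hit_rate(candidates[label], dictated))
```

Running this over accumulated dictations for each office gives a single number per weighting scheme, which is enough to rank the schemes against one another.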
[0068] The specific embodiment of the present invention provided above
is somewhat idealistic in that it presumes that commercially available
speech recognition software provides for dynamically loadable databases
and the possibility to hierarchically direct the speech recognition
software to sequentially search several such loaded databases,
including possibly the general or base vocabulary that the software is
programmed to operate with for most other dictations. Unfortunately,
none of the speech recognition software packages examined include these
general capabilities. Thus, certain improvisations have been made with
respect to an existing speech recognition software package in order to
practice the advantages of the embodiment as described below.
[0069] In one particular application, the speech recognition software
interfaces with computer operating systems according to an industry
standard called the "Speech Application Programming Interface"
protocol, abbreviated "SAPI." SAPI was originally designed for the
Microsoft™ Windows operating systems. During the 1990's a similar
protocol called SRAPI was developed for non-Windows operating systems,
but SRAPI lost support in the computer industry and current versions of
SAPI have been applied to non-Windows as well as Windows operating
systems.

[0070] SAPI (and, in its day, SRAPI) provide for computer-based
responses to three types of speech input: application-defined commands,
user-defined commands (both referred to hereinafter as "commands") and
general dictation of vocabulary. A signal representing an incoming item
of speech is first screened by the program to see if it represents a
command, such as "New paragraph," and, if so, the program executes it
as such. Within speech recognition applications such as a word
processor, this command may cause the insertion of a paragraph break, a
new-line feed and an indent so as to permit the continued dictation in
a new paragraph. Incoming speech items that are not recognized as
commands are transcribed as general vocabulary text, in which the
speech recognition software looks for the best possible match for the
dictated text within combinations of single word text strings loaded
into the general vocabulary database of the application.
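
The command-first screening described above can be sketched in a few lines of Python. This is an illustration only and does not use the SAPI interfaces: the utterance is assumed to arrive as already-recognized text rather than an audio signal, and `handle_utterance`, the command table and the `transcribe` callback are hypothetical names.

```python
def handle_utterance(utterance, commands, transcribe):
    """Execute a matching command, otherwise fall back to dictation.

    commands:   dict mapping a lower-case command phrase to a callable.
    transcribe: callable used for general vocabulary dictation.
    """
    action = commands.get(utterance.strip().lower())
    if action is not None:
        # The item of speech was recognized as a command: execute it.
        return action()
    # Not a command: treat it as general dictation.
    return transcribe(utterance)
```

In a word processor, for example, the "new paragraph" entry could return a line break and an indent, while every other utterance is passed on to general dictation.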
[0071] Current versions of the SAPI protocol and current voice engines
only accommodate the loading of one vocabulary at a time. However, they
accept rapid loading and unloading of smaller sets of user-defined
commands. These smaller sets may be as large as the relatively small,
first-pass vocabularies needed to optimize speech recognition accuracy
for dictation into a computer field. The embodiment encompasses methods
to identify, prioritize and select the high probability text strings
which would optimize transcription accuracy if used as a first pass
pre-screening vocabulary. These text strings may then be translated
into user-defined commands which are loaded and screened for matches as
a first pass "virtual vocabulary." In this manner, the existing speech
recognition systems have been tricked into implementing a two-pass
vocabulary screening model as described above under present SAPI
protocols and with presently available voice engines. Incorporation of
the methods and apparatus of the embodiment would be made more
user-friendly by incorporating the entirety of this embodiment into
future versions of SAPI and into applications compliant with such
future versions of SAPI.
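
The "virtual vocabulary" idea can be pictured with the following Python sketch, in which each high-probability text string becomes a command whose only effect is to produce that string. The class and method names are hypothetical and no real SAPI call is used; the sketch only models the loading, unloading and first-pass lookup of a small command set.

```python
class VirtualVocabulary:
    """A small, swappable command set standing in for a first-pass vocabulary."""

    def __init__(self):
        self.commands = {}

    def load(self, text_strings):
        """Replace the current command set with one command per text string."""
        self.commands = {text.lower(): text for text in text_strings}

    def first_pass(self, utterance):
        """Return the stored text string, or None to defer to the second pass."""
        return self.commands.get(utterance.strip().lower())
```

Calling `load` again when the user moves to another form field models the rapid unloading and reloading of user-defined commands; a `None` result signals that the general vocabulary should be consulted as the second pass.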

[0072] Referring to Fig. 8, a general process flow for the operation of
the speech recognition system 200 is provided as it would be
implemented within a specific SAPI speech recognition engine. In
general, the steps are substantially similar to those provided in
Fig. 7 with the following modifications. At step 740, instead of
evaluating the speech input against a set of text strings in the
selected/created database, the process of Fig. 8 sequentially evaluates
the speech input first against the database of system commands 840, and
then, if necessary, against the database of user-defined commands 841,
and then, if necessary, against the database of a first vocabulary 842,
and then, if necessary, against the database of a second vocabulary
843, and finally, if necessary, against a final database 844. If a
match is determined during any one of these evaluations (steps
850-853), then either the "command" is executed (steps 854-855) or a
learning function is exercised (steps 856-858), and the executed
command or selected text from a database results in the generation and
insertion of the selected text string into a computer form field (step
860).
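
The sequential evaluation of Fig. 8 can be approximated by the Python sketch below. It is a stand-in, not the engine's method: text similarity from the standard library's `difflib.SequenceMatcher` replaces acoustic matching, and the 0.8 threshold and the `cascade_match` name are assumptions made for the example.

```python
from difflib import SequenceMatcher

def cascade_match(utterance, databases, threshold=0.8):
    """Search ordered databases and stop at the first acceptable match.

    databases: list of (name, entries) pairs in search order, e.g. system
               commands, user-defined commands, first vocabulary, and so on.
    Returns (database name, matched entry), or None if nothing qualifies.
    """
    spoken = utterance.lower()
    for name, entries in databases:
        best, best_score = None, 0.0
        for entry in entries:
            # Text similarity stands in for an acoustic match score.
            score = SequenceMatcher(None, spoken, entry.lower()).ratio()
            if score > best_score:
                best, best_score = entry, score
        if best is not None and best_score >= threshold:
            return name, best
    return None
```

Because the loop returns as soon as one database yields a match above the threshold, later and larger databases are consulted only "if necessary", as in the flow described above.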

[0073] With specific application to Example #1 provided above, the
method provided in the flow diagram of Fig. 7 may be modified to
operate more efficiently by including some of the elements of the
process shown in Fig. 8. For each context of user (radiologist), type
of imaging study (as chest x-ray or sinus CT), patient demographics
(including age, sex, past medical history, reason for this study) and
field of report, a first pass vocabulary 842 may be provided which
includes previous dictations by the same user when all the other
variables were identical. The second pass vocabulary 843 may be
provided which includes dictations by other members of the radiology
group when all other variables were the same as those of the present
report. The third pass vocabulary 844 may be provided which includes
other dictations by the present radiologist into the same field for the
same type of study but for patients with all combinations of age, sex,
past medical history and reason for study. Thus a multiple pass series
of specific context-dependent sub-databases may be provided in actual
application before the base vocabulary of the speech recognition
software is employed to provide a match.
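
A minimal Python sketch of how the three pass vocabularies could be assembled from stored dictations follows. The record layout and the `build_passes` name are hypothetical, and patient demographics are reduced to a single comparable value for brevity.

```python
def build_passes(history, user, study, field, demographics):
    """Split prior dictations into first, second and third pass vocabularies.

    history: list of dicts with "user", "study", "field", "demographics"
             and "text" keys (hypothetical layout, assumed for illustration).
    """
    first, second, third = [], [], []
    for rec in history:
        # Every pass is limited to the same study type and report field.
        if rec["study"] != study or rec["field"] != field:
            continue
        same_user = rec["user"] == user
        same_demo = rec["demographics"] == demographics
        if same_user and same_demo:
            first.append(rec["text"])    # same user, all variables identical
        elif same_demo:
            second.append(rec["text"])   # other users, same variables
        elif same_user:
            third.append(rec["text"])    # same user, other demographics
    return first, second, third
```

The three lists would then be searched in order, with the base vocabulary of the speech recognition software consulted only after all three fail to provide a match.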

[0074] Although the invention has been described with reference to
specific exemplary embodiments thereof, it will be understood that
numerous variations, modifications and additional embodiments are
possible, and accordingly, all such variations, modifications, and
embodiments are to be regarded as being within the scope of the
invention. As such, the scope of the invention is intended to be
limited only by the claims of the invention and not by any one aspect
of the description provided above, since the drawings and descriptions
are to be regarded as illustrative in nature only.



Claims

1. A method of operating a speech recognition system, said speech
recognition system including a base vocabulary, the method comprising:

creating a specified database containing text strings, wherein the
text strings are provided from the inputs of previous use of the
system;
defining a sub-database within the specified database containing text
strings associated with a context of input data;
identifying the context of an input of data;
loading (705) a specified vocabulary from the sub-database into
computer storage, said specified vocabulary associated with a specific
context;
accepting (710) a user's voice input into said speech recognition
system;
evaluating (740) said user's voice input with data values from said
specified vocabulary according to an evaluation criterion;
selecting a particular data value as an input into a computerised form
field if said evaluation criterion is met; and
if said user's voice input does not meet said evaluation criterion,
selecting a data value from said base vocabulary as an input into said
computerised form field.

2. The speech recognition method of claim 1 wherein said context is
associated with said field.

3. The speech recognition method of claim 1 wherein said context is
associated with a topical subject.

4. The speech recognition method of claim 1 wherein said context is
associated with a specific user.

5. The method of claim 1 further comprising evaluating said user's
voice input with data values from said base vocabulary according to a
base evaluation criterion if said user's voice input does not meet said
evaluation criterion.

6. The method of claim 1 wherein said evaluation criterion is a use
weighting associated with said data values.

7. The method of claim 1 wherein said step of evaluating further
includes the step of applying a matching heuristic against a known
threshold.

8. The method of claim 7 wherein said step of applying a matching
heuristic further includes a step of comparing said user's voice input
to a threshold probability of matching an acoustic model derived from
said specified vocabulary.

9. A method as claimed in claim 1, wherein said first specified
vocabulary is associated with a first computerised form field;
loading a second specified vocabulary from a second sub-database into
computer storage, said second specified vocabulary associated with a
second computerised form field;
accepting a user's further voice input into said speech recognition
system;
evaluating said user's voice input with data values from said specified
vocabulary according to an evaluation criterion; and
selecting a particular data value as input into a second computerised
form field if said user's voice input meets said evaluation criterion.

10. The method of claim 9 wherein said evaluation criterion for said
steps of evaluating said first and said second specified vocabularies
are the same.

11. The method of claim 9 wherein said evaluation criterion for said
steps of evaluating said first and said second specified vocabularies
are different criteria.

12. The method of claim 9 wherein said first and second computerised
form fields are associated with different fields of a computerised
medical form.

13. A method as claimed in claim 1, comprising:

loading a second specified vocabulary from a second sub-database into
computer storage, said second specified vocabulary associated with a
second user of said speech recognition system;
accepting a second user's voice input into said speech recognition
system;
evaluating said second user's voice input with data values from said
specified vocabulary according to an evaluation criterion; and
selecting a particular data value as an input into said computerised
form field if said second user's voice input meets said criterion.



14. The method of claim 13 wherein said first and second users of said
speech recognition system are different doctors and said computerised
form fields are associated with a field within a computerised medical
form.

15. A method as claimed in claim 1 comprising:

loading a second specified vocabulary from a second sub-database into
computer storage, said second specified vocabulary associated with a
second context used within said speech recognition system;
accepting said user's further voice input into said speech recognition
system;
evaluating said user's voice input with data values from said specified
vocabulary according to an evaluation criterion; and
selecting a particular data value as an input into said computerised
form field if said user's voice input meets said evaluation criterion.

16. The method of claim 15 wherein said first context is a patient's
age and said second context is a patient diagnosis of said patient.

17. A speech recognition system including a base vocabulary, the system
comprising:

processing means (240, 245) adapted to create a specified database
containing text strings, wherein the text strings are provided from the
inputs of previous use of the system, and to define a sub-database
within the specified database containing text strings associated with a
context of input data;
a context identification module (240) adapted to identify the context
of an input of data;
the processing means being further adapted to load (705) a specified
vocabulary from the sub-database into computer storage, said specified
vocabulary associated with a specific context;
to accept (710) a user's voice input into said speech recognition
system;
to evaluate (740) said user's voice input with data values from said
specified vocabulary according to an evaluation criterion;
to select a particular data value as an input into a computerised form
field if said evaluation criterion is met; and
if said user's voice input does not meet said evaluation criterion, to
select a data value from said base vocabulary as an input into said
computerised form field.

18. The speech recognition system of claim 17 wherein said context is a
topical context.

19. The speech recognition system of claim 17 wherein said context is
associated with a specific user of said speech recognition system.

20. The speech recognition system of claim 17 wherein said context is
associated with said field.

21. A database for a speech recognition system, the database containing
text strings, wherein the text strings are provided from the inputs of
previous use of the system, the database comprising a plurality of
sub-databases, each containing a respective vocabulary associated with
a respective context of input data.

22. A computer program comprising instructions for controlling a
processor of a speech recognition system when executed to carry out all
of the steps of a method as claimed in any one of claims 1 to 16.


Revendicatlons 

1 . Precede de mise en oeuvre d'un systeme de recon- 
naissance vocale, ledit systeme de reconnaissance 
vocale comprenant un vocabulaire de base, le pro- 
cede comprenant : 

fa creation d'une base de donnees specifiee 
contenant des chaines de texte, tes chaTnes de 
texte etant fournies a parti r des entrees d'une 
utilisation precedente du systeme ; 
la definition d'une sous-base de donnees dans 
(a base de donnees specifiee, contenant des 
chaTnes de texte associe es a un contexte de 
donnees d'entree ; 

{'Identification du contexte d'une entree de 
donnees ; 

le charge ment (705) d'un vocabulaire spec (fie 
depuis la sous-base de donnees dans une me- 
moire d'ordlnateur, ledit vocabulaire specifie 
etant associe a un contexte speciflque ; 
r acceptation (710) d'une entree vocale d'utilisa- 
teur dans ledit systeme de reconnaissance 
vocale ; 

revaluation (740) de ladite entree vocala d'utili- 
sateur avec des valeurs de donnees provenant 
dudit vocabulaire specifie conformernent a un 
critere devaluation ; 

la selection d'une vaieur de donnee particuliere 
en tant qu'entree dans un champ de forme in- 
formatisee si ledit critere devaluation est 
satisfait ; et 

si ladite entree vocale cfutMsateur ne satisfait 
pas auxdrts criteres devaluation, la selection 
cf une vaieur de donnee a parOr dudit vocabulai- 
re de base en tant qu'entree dans ledit champ 
de forme informatlsee. 

2. Precede de reconnaissance vocale selon la reven- 
dicatlon 1, dans lequel ledit contexte est assocte 
audit champ. 

3. Prooede de reconnaissance vocale selon la reven- 
dicatlon 1 , dans lequel ledit contexte est associe a 
un sujet topic] ue. 
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4. Precede cte reconnaissance vocale selon la reven- 
dication 1 , dans lequel ledlt contexte est assocle a 
un utilisateurspecrfique. 

5. Precede selon la revendication 1, comprenant en 
outre revaluation de ladite entree vocale d'utilisateur 
avec des valeurs de donnees provenant dudit voca- 
bulaire de base conformement a un critere ©"evalua- 
tion de base si ladite entree vocale d'utlllsateur ne 
satisfait pas audit critere devaluation. 

6. Proceed selon la revendication 1 , dans lequel (edit 
crttere devaluation est une ponderation d'utiHsatlon 
associee auxdites valeurs de donnees. 

7. Procede selon la revendication 1 , dans lequel ladite 
etape devaluation comprend en outre t'etape d'ap- 
plfcation d'une heuristique d'adaptatlon par rapport 
a un seuil connu. 

8. Procede* selon la revendication 7, dans lequel ladite 
etape d'application d'une heuristique ©"adaptation 
comprend en outre une etape de comparaison de 
ladite entree vocale d'utilisateur a une probabrDte de 
seuil d'adaptation d'un modele acoustique derive du- 
dtt vocabulaire specifie. 

9. Procede selon la revendication 1, dans lequel ledlt 
premier vocabulaire specific est assocle a un pre- 
mier champ de forme informatisee ; 

le chargement cfun second vocabulaire specific de- 
puis une seconde sous-base de donnees dans une 
memoire d'ordinateur, I edit second vocabulaire spa- 
rine etant assccie a un second champ de forme 
Inform atisee ; 

I'acceptation d'une autre entree vocale d'utilisateur 
dans ledit systeme de reconnaissance vocale ; 
revaluation de ladite entree vocale d'utilisateur avec 
des valeurs de donnees provenant ducflt vocabulaire 
specifie conformement a un critere devaluation ; et 
. la selection d'une valeur de donn6e particullere en 
tant qu 'entree dans un second champ de forme fn- 
formatisee si ladite entree vocale d'utilisateur satis- 
fait audit critere devaluation. 

10. Precede selon la revendication 9, dans lequel ledit 
critere devaluation pour iesdites etape s devaluation 
desdits premier et second vocabulalres specifies est 
le mdme. 



1 1 . Proc6d6 selon la revendication 9, dans lequel lesdits 
criteres devaluation pour Iesdites etapes d' evalua- 
tion desdits premier et second vocabulaires speci- 
fies sont des criteres difterents. 

1 2. Procede selon la revendication 9, dans lequel lesdits 
premier et second champs de forme Informatisee 
sont associes a differents champs d'une forme me- 



dicate informatisee. 

13. Procede selon la revendication 1 , comprenant : 

le chargement d'un second vocabulaire specifie 
depuis une seconde sous-base de donnees 
dans une memoire d'ordinateur, (edit second vo- 
cabulaire specifie etant associe a un second utl- 
lisateur dudh systeme de reconnaissance 
10 vocale ; 

' I'acceptation d'une seconde entree vocale d'uti- 
lisateur dans ledit systeme de reconnaissance 
vocafe ; 

revaluation de ladite seconde entree vocale 
15 d'utilisateur avec des valeurs de donnees pro- 

venant dudit vocabulaire specifie conformement 
a un critere devaluation ; et 
la selection d'une valeur de donnee particullere 
en tant qu'entree dans ledit champ de forme in- 
20 formetisee si ladrte seconde entree vocale d' utl- 

lisateur satisfait audit critere. 



14. 



Procede selon la revendication 13, dans lequel les- 
dits premier et second utilisateurs dudlt systeme de 
reconnaissance vocale sont des medeci ns differents 
et lesdits champs de forme informatisee sont asso- 
cies a un champ dans une forme medlceJe informa- 
tisee. 



30 15. Procede selon la revendication 1 , comprenant : 



le chargement d'un second vocabulaire specifie 
depuis une seconde sous-base de donnees 
dans une memoire d'ordinateur, ledlt second vo- 
cabulaire specifie etant assccie a un second 
contexte utilise dans ledit systeme de recon- 
naissance vocale ; 

I'acceptation de ladite autre entree vocale d'uti- 
lisateur dans ledit systeme de reconnaissance 
vocale; 

revaluation de ladite entree vocale d'utilisateur 
avec des valeurs de donnees provenant dudit 
vocabulaire specifie conformement a un critere 
devaluation ; et 

la selection d'une valeur de donnee particuliere 
en tant qu'entree dans ledlt champ de forme in- 
formatisee si ladite entree vocale d'utilisateur 
satisfait audit crttere devaluation. 



35 



40 



SO 



55 



16. Procede selon la revendication 15, dans lequel ledit 
premier contexte estraged'un patient et I edit second 
contexte est un diagnostic dudit patient. 

17. A speech recognition system comprising a base vocabulary,
the system comprising:

processing means (240, 245) arranged to create a specified
database containing text strings, the text strings being
produced from the entries of a previous use of the system,
and to define a sub-database within the specified database
containing text strings associated with an input data context;

a context identification module (240) arranged to identify
the context of a data entry;

the processing means being further arranged:

to load (705) a specified vocabulary from the sub-database
into a computer memory, said specified vocabulary being
associated with a specific context;

to accept (710) a user voice input into said speech
recognition system;

to evaluate (740) said user voice input against data values
from said specified vocabulary according to an evaluation
criterion;

to select a particular data value as input into a
computerized form field if said evaluation criterion is
satisfied; and

if said user voice input does not satisfy said evaluation
criterion, to select a data value from said base vocabulary
as input into said computerized form field.
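The evaluate-then-fall-back behaviour recited in claim 17 can be illustrated with a minimal sketch. It is not the claimed implementation: the voice input is assumed to arrive as a rough text hypothesis, the evaluation criterion is assumed to be a string-similarity threshold, and the names `similarity`, `best_match`, `select_field_value` and `threshold` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] (assumed criterion)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(heard, vocabulary):
    """Return (value, score) for the closest vocabulary entry, or (None, 0.0)."""
    scored = [(similarity(heard, value), value) for value in vocabulary]
    if not scored:
        return None, 0.0
    score, value = max(scored)
    return value, score


def select_field_value(heard, specified_vocab, base_vocab, threshold=0.8):
    """Try the context-specific vocabulary first, then the base vocabulary."""
    value, score = best_match(heard, specified_vocab)
    if value is not None and score >= threshold:
        # Criterion satisfied: use the value from the specified vocabulary.
        return value, "specified"
    # Criterion not satisfied: select a value from the base vocabulary.
    value, _ = best_match(heard, base_vocab)
    return value, "base"


if __name__ == "__main__":
    specified = ["diabetes", "hypertension"]
    base = ["hello", "world", "diabetic"]
    print(select_field_value("diabetis", specified, base))
    print(select_field_value("hello", specified, base))
```

Trying the smaller, context-specific vocabulary first keeps the candidate set short, and the base vocabulary is consulted only when the criterion fails.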

18. A speech recognition system according to claim 17, wherein
said context is a topical context.

19. A speech recognition system according to claim 17, wherein
said context is associated with a specific user of said speech
recognition system.

20. A speech recognition system according to claim 17, wherein
said context is associated with said field.

21. A database for a speech recognition system, the database
containing text strings, the text strings being supplied from
the entries of a previous use of the system, the database
comprising multiple sub-databases each containing a respective
vocabulary associated with a respective input data context.
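As an illustration only, the database of claim 21 can be pictured as a mapping from each input data context to the text strings previously entered in that context. The class name `SpecifiedDatabase` and its methods are hypothetical, and an in-memory dictionary is assumed in place of whatever storage an actual system would use.

```python
from collections import defaultdict


class SpecifiedDatabase:
    """Text strings from earlier use, grouped into one sub-database per context."""

    def __init__(self):
        # context name -> set of text strings seen in that context
        self._sub_databases = defaultdict(set)

    def record_entry(self, context, text):
        """Store a text string produced by a previous use of the system."""
        self._sub_databases[context].add(text)

    def load_vocabulary(self, context):
        """Return the vocabulary for one context (empty list if unknown)."""
        return sorted(self._sub_databases.get(context, ()))


if __name__ == "__main__":
    db = SpecifiedDatabase()
    db.record_entry("diagnosis", "diabetes")
    db.record_entry("diagnosis", "asthma")
    db.record_entry("age", "42")
    print(db.load_vocabulary("diagnosis"))
```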

22. A computer program comprising instructions for controlling
a processor of a speech recognition system, when run, to carry
out all the steps of a method according to any one of claims
1 to 16.